Sanity-check experiments

  • Author: Fang Zhang
  • Date: 2016.6.18
  • E-mail: fza34@sfu.ca

Experiment 20160618

  1. Merge all non-resistance (unresi) VCF files. The merged file contains 9772 SNPs in total.

    # vcf-merge expects bgzip-compressed, tabix-indexed input files
    vcf-merge *unresi.vcf.gz > merge.vcf
    
  2. Verify that the intersection of merge.vcf and Resi-List-MasterV27_.vcf is empty. (It is indeed empty.)

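    # Extract the POS column of the data records (header lines skipped), de-duplicated per file
    awk '!/^#/ && /Mycobacterium/ {print $2}' merge.vcf | sort -u > merge.txt
    awk '!/^#/ && /Mycobacterium/ {print $2}' ../../../input/RESI_SNPs/Resi-List-MasterV27_.vcf | sort -u > resi.txt
    # Print positions present in both lists; de-duplicating the concatenated lists
    # instead would remove exactly the duplicates this check looks for
    comm -12 merge.txt resi.txt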
    
  3. For the same sample sequenced in different experiments (rows sharing a sample_alias), the SNP sets should overlap by about 95% (coverage column). The results are below:

sample GATK mpileup intersection resi unresi unrepeated repeated SNP_frequency>1 sample_intersec coverage(%) sample_alias
ERR553274 1398 1652 1330 18 1312 1156 156 1145 1110 95.16 8380-11
ERR553275 1334 1479 1289 18 1271 1144 127 1132 1110 95.16 8380-11
ERR551956 1443 1602 1391 24 1367 1195 172 1193 1155 95.14 6844-06
ERR551957 1418 1567 1358 24 1334 1178 156 1176 1155 95.14 6844-06
ERR552910 1443 1531 1406 22 1384 1224 160 1224 1190 94.97 9827-01
ERR552911 1424 1545 1383 21 1362 1221 141 1219 1190 94.97 9827-01
ERR551977 1377 1523 1332 18 1314 1196 118 1191 1167 94.8 1934-03
ERR551978 1414 1548 1359 19 1340 1213 127 1207 1167 94.8 1934-03
ERR552549 1436 1603 1355 20 1335 1200 135 1192 1162 94.78 8374-11
ERR552550 1387 1437 1338 20 1318 1198 120 1196 1162 94.78 8374-11
ERR550643 1291 1397 1266 18 1248 1138 110 1135 1110 94.71 2009-08
ERR550644 1328 1490 1280 17 1263 1153 110 1147 1110 94.71 2009-08
ERR553081 1395 1544 1357 19 1338 1204 134 1201 1173 94.44 1500-03
ERR553082 1444 1679 1385 17 1368 1219 149 1214 1173 94.44 1500-03
ERR551167 1400 1459 1365 24 1341 1195 146 1195 1170 94.35 8886-01
ERR551168 1429 1528 1391 23 1368 1216 152 1215 1170 94.35 8886-01
ERR550782 1475 2140 1427 23 1404 1232 172 1220 1176 94.31 10107-01
ERR550783 1480 2117 1427 23 1404 1246 158 1203 1176 94.31 10107-01
ERR550946 1402 1566 1344 20 1324 1168 156 1163 1125 94.3 4403-05
ERR550947 1379 1576 1329 20 1309 1159 150 1155 1125 94.3 4403-05
ERR551155 1453 1665 1378 23 1355 1211 144 1204 1160 94.16 8389-11
ERR551156 1387 1467 1346 22 1324 1188 136 1188 1160 94.16 8389-11
ERR550658 1451 2033 1401 23 1378 1236 142 1195 1169 94.12 709-05
ERR550659 1516 2301 1469 23 1446 1280 166 1216 1169 94.12 709-05
ERR550940 1488 1629 1404 21 1383 1225 158 1221 1168 93.82 8383-11
ERR550941 1406 1547 1352 21 1331 1196 135 1192 1168 93.82 8383-11
ERR551680 1448 1593 1389 20 1369 1208 161 1208 1164 93.8 7453-10
ERR551681 1421 1530 1356 19 1337 1199 138 1197 1164 93.8 7453-10
ERR551369 1451 1600 1368 18 1350 1222 128 1219 1171 93.76 R04-0039
ERR551370 1426 1589 1355 19 1336 1205 131 1201 1171 93.76 R04-0039
ERR551854 1353 1511 1317 19 1298 1185 113 1182 1158 93.69 4296-09
ERR551855 1423 1571 1365 19 1346 1213 133 1212 1158 93.69 4296-09
ERR552939 1497 2482 1457 23 1434 1265 169 1214 1169 93.67 685-05
ERR552940 1556 2261 1510 23 1487 1342 145 1203 1169 93.67 685-05
ERR551821 1476 2098 1429 24 1405 1246 159 1222 1180 93.65 711-05
ERR551822 1502 2189 1440 23 1417 1252 165 1218 1180 93.65 711-05
ERR552493 1501 2251 1441 21 1420 1243 177 1218 1173 93.62 8888-01
ERR552494 1479 2021 1436 21 1415 1260 155 1208 1173 93.62 8888-01
ERR551693 1421 1573 1366 22 1344 1220 124 1218 1170 93.6 1511-02
ERR551694 1409 1517 1349 22 1327 1205 122 1202 1170 93.6 1511-02
ERR553156 1411 1552 1375 18 1357 1222 135 1217 1159 93.24 5569-09
ERR553157 1355 1507 1325 19 1306 1200 106 1185 1159 93.24 5569-09
ERR551927 1389 1492 1337 20 1317 1190 127 1185 1152 93.2 10737-02
ERR551928 1423 1540 1361 20 1341 1208 133 1203 1152 93.2 10737-02
ERR552444 1394 1622 1362 18 1344 1182 162 1177 1118 92.7 7461-10
ERR552445 1343 1579 1304 18 1286 1162 124 1147 1118 92.7 7461-10
ERR552894 1431 1559 1365 20 1345 1215 130 1210 1145 92.41 7426-10
ERR552895 1337 1440 1307 19 1288 1176 112 1174 1145 92.41 7426-10
ERR553303 1320 1386 1298 26 1272 1176 96 1167 1128 91.78 8073-07
ERR553304 1388 1467 1346 26 1320 1194 126 1190 1128 91.78 8073-07
ERR551184 1380 1558 1337 17 1320 1166 154 1157 1080 91.37 11251-09
ERR551185 1291 1465 1262 18 1244 1128 116 1105 1080 91.37 11251-09
ERR551804 1443 1722 1401 24 1377 1233 144 1206 1141 91.21 7942-05
ERR551805 1373 1476 1352 24 1328 1206 122 1190 1141 91.21 7942-05
ERR551806 1395 1557 1357 24 1333 1208 125 1192 1141 91.21 7942-05
ERR551556 1435 1755 1400 24 1376 1218 158 1201 1111 89.74 2389-05
ERR551557 1335 1433 1317 23 1294 1179 115 1165 1111 89.74 2389-05
ERR551558 1354 1589 1329 21 1308 1199 109 1174 1111 89.74 2389-05
ERR551360 1464 1741 1422 23 1399 1231 168 1209 1119 89.38 3811-05
ERR551361 1420 1570 1378 23 1355 1209 146 1187 1119 89.38 3811-05
ERR551362 1401 1632 1363 24 1339 1203 136 1179 1119 89.38 3811-05
ERR550777 1375 1585 1342 23 1319 1191 128 1174 1102 88.44 8455-05
ERR550778 1420 1565 1372 25 1347 1208 139 1200 1102 88.44 8455-05
ERR550779 1335 1526 1324 24 1300 1185 115 1172 1102 88.44 8455-05
ERR551943 1256 1339 1239 21 1218 1131 87 1129 1094 86.96 9052-05
ERR551944 1446 1580 1377 23 1354 1228 126 1220 1094 86.96 9052-05
ERR551945 1449 1537 1405 24 1381 1217 164 1216 1094 86.96 9052-05
ERR552689 1423 1533 1375 20 1355 1216 139 1207 1035 84.21 178-03
ERR552690 1211 1268 1176 18 1158 1073 85 1057 1035 84.21 178-03
ERR552668 1304 1419 1291 23 1268 1146 122 1146 964 77.55 12448-03
ERR552669 1269 1504 1248 21 1227 1140 87 1074 964 77.55 12448-03
ERR552670 1419 2009 1392 22 1370 1237 133 1176 964 77.55 12448-03
ERR552671 1440 2160 1393 21 1372 1243 129 1183 964 77.55 12448-03
ERR551979 1139 1196 1118 15 1103 1032 71 1008 698 56.61 7187-09
ERR551980 1295 1456 1272 18 1254 1150 104 1140 698 56.61 7187-09
ERR551981 1247 1304 1204 17 1187 1095 92 1065 698 56.61 7187-09
ERR551982 1353 1483 1312 18 1294 1177 117 1175 698 56.61 7187-09
ERR551983 1061 1116 1041 15 1026 953 73 938 698 56.61 7187-09
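
The table can be cross-checked with a short script. The sketch below rests on two assumptions inferred from the numbers, not stated in these notes. First, column 4 (intersection) equals column 5 (resi) plus column 6 (unresi), and column 6 equals the sum of columns 7 and 8; this holds for ERR553274, for example (1330 = 18 + 1312, 1312 = 1156 + 156). Second, for an alias with exactly two runs, coverage looks like the size of the shared SNP set divided by the size of the union of the two SNP_frequency>1 sets, where the union is a + b - shared. That reproduces the listed value for ERR551956/ERR551957 (1155 / 1214 = 95.14) and several other pairs, but gives 95.12 instead of the listed 95.16 for ERR553274/ERR553275, so treat it as an approximation. The function names pair_coverage and check_columns are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; column numbers follow the table header above.

# Coverage for an alias with exactly two runs, ASSUMING
#   coverage = 100 * shared / (a + b - shared)
# Arguments: SNP_frequency>1 count of run A, of run B, and sample_intersec.
pair_coverage() {
    awk -v a="$1" -v b="$2" -v i="$3" \
        'BEGIN { printf "%.2f\n", 100 * i / (a + b - i) }'
}

# Read table rows on stdin and print the sample ID of every row where
# column 4 != column 5 + column 6, or column 6 != column 7 + column 8.
# Lines that do not start with an ERR accession (e.g. the header) are skipped.
check_columns() {
    awk '$1 ~ /^ERR/ && ($4 != $5 + $6 || $6 != $7 + $8) { print $1 }'
}

pair_coverage 1193 1176 1155   # ERR551956 / ERR551957
```

For aliases with three or more runs the union cannot be recovered from the counts alone, so pair_coverage does not apply to those rows. To check the column arithmetic, save the table to a file (the name table.txt is hypothetical) and run check_columns < table.txt; rows that break either relation are printed.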

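Steps 1 and 2 can be wrapped in small reusable functions. This is a minimal sketch, not the commands used to produce the results above. The function names are hypothetical, and unlike step 2 it compares every data record instead of filtering on the Mycobacterium reference name, on the assumption that all records lie on that one reference. Each file's positions are de-duplicated separately before the comparison, so a position is reported only when it occurs in both files.

```shell
#!/bin/sh
# Hypothetical helpers for steps 1 and 2; names are illustrative.

# Count data records (non-header lines) in an uncompressed VCF file.
count_vcf_records() {
    grep -vc '^#' "$1"
}

# Print the POS column (field 2) of every data record, de-duplicated.
vcf_positions() {
    awk '!/^#/ { print $2 }' "$1" | sort -u
}

# Print positions that occur in both VCF files; no output means no overlap.
shared_positions() {
    { vcf_positions "$1"; vcf_positions "$2"; } | sort | uniq -d
}
```

Usage: count_vcf_records merge.vcf prints the number of records in the merged file (every record, not only SNPs, if other variant types are present), and shared_positions merge.vcf ../../../input/RESI_SNPs/Resi-List-MasterV27_.vcf prints nothing when the two files share no position.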